Tracking Evasive Objects via A Search Allocation Game


Huimin  Chen,  Dan  Shen,  Genshe  Chen,  Erik  P.  Blasch,  and  Khanh  Pham 


Abstract: This paper outlines a strategy for tracking evasive
objects in discrete space using game theory to allocate sensor
resources. One or more searchers have to allocate their effort
among the discrete cells to maximize the object detection
probability within a finite time horizon, or to minimize the
expected search time needed to achieve the desired detection
probability under a false alarm constraint. We review the standard
formulations under a sequential decision setting for finding
stationary objects. Then we consider both robust and optimal
search strategies and extend the standard search problem to a
two-person zero-sum search allocation game where the object wants
to hide from the searcher and the object has incomplete
information about the searcher's remaining search time. We discuss
how the results affect the sensor management and mission planning
for cooperative unmanned aerial vehicle (UAV) search tasks and
provide simulation examples to show the effectiveness of the
proposed method compared with a random search strategy.

I. INTRODUCTION

One of the challenges in the constellation management
of sensor platforms and in the path planning for tracking
evasive objects is an associated search problem: for objects
or threats that have not yet been identified, how should one model
the uncertainties in the operational field and allocate the
sensing resources accordingly? The field of search theory
addresses this problem from various aspects: the search space
can be discrete or continuous; the object can be stationary
or moving; the sensor can have a single look or multiple looks
of the area at a particular time instant. For a comprehensive
review, see [2].

A. Object Search Problem in Discrete Space

We consider a finite probability space $X = \{1, 2, \ldots, n\}$ with
each point $i \in X$ being associated with a Wiener process
$y_i(t)$. The n Wiener processes are independent with the same
variance $\sigma^2 t$. If an object is in cell i, then $E[y_i(t)] = \mu t$
($\mu > 0$). Otherwise, the mean of the process $y_i(t)$ is zero ($\mu = 0$).
A searcher can only focus on one cell at any given time.
Assume that the prior probability that cell i contains an object
is $p_i(0)$. Then a searcher looking at cell i from time 0 to t with
measurement $y = y_i(t)$ will update its posterior probability
that cell i contains the object using Bayes' rule

$$p_i(t|y) = \frac{p_i(0)}{p_i(0) + (1 - p_i(0))\, e^{\frac{\mu}{2\sigma^2}(\mu t - 2y)}}. \qquad (1)$$

In terms of the log-likelihood ratio

$$Z_i(t) = \log \frac{p_i(t|y)}{1 - p_i(t|y)}$$

we have

$$Z_i(t) = Z_i(0) - \frac{\mu}{2\sigma^2}(\mu t - 2y). \qquad (2)$$

Clearly, $Z_i(t)$ is also a Wiener process with mean $(\mu^2/2\sigma^2)t$
if cell i contains an object and $-(\mu^2/2\sigma^2)t$ if cell i does not
contain an object. The variance is $(\mu^2/\sigma^2)t$ in either case.
The searcher needs to sequentially determine which cell to
look at and for how long.
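As a minimal sketch of the update in (1) and (2), the Python functions below compute the posterior probability and the log-likelihood ratio for a single cell. The function names and the numerical values are illustrative assumptions, not part of the original formulation.

```python
import math

def posterior(p0, y, t, mu, sigma):
    """Posterior probability that the observed cell holds the object, as in (1)."""
    exponent = mu * (mu * t - 2.0 * y) / (2.0 * sigma ** 2)
    return p0 / (p0 + (1.0 - p0) * math.exp(exponent))

def log_odds(p):
    """Log-likelihood ratio Z = log(p / (1 - p))."""
    return math.log(p / (1.0 - p))

def updated_log_odds(z0, y, t, mu, sigma):
    """Log-likelihood ratio after observing y over [0, t], as in (2)."""
    return z0 - mu * (mu * t - 2.0 * y) / (2.0 * sigma ** 2)

# Assumed example values: prior 0.1, drift 1, unit noise, a look of 4 time units.
p0, mu, sigma, t = 0.1, 1.0, 1.0, 4.0
y = 5.0  # accumulated measurement
p1 = posterior(p0, y, t, mu, sigma)
z1 = updated_log_odds(log_odds(p0), y, t, mu, sigma)
```

The two routes agree: converting `p1` to log-odds gives `z1`. A measurement above $\mu t/2$ raises the posterior, and one below it lowers the posterior.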
B. Related Works

Optimal  search  theory  deals  with  the  following  generic 
problem:  A  single  object  is  hidden  in  one  of  the  n  cells.  Each 
cell  can  provide  the  searcher  with  prior  probability  of  object 
presence  as  well  as  the  detection  and  false  alarm  probabilities 
for  a  fully  specified  sensing  action.  The  goal  is  to  design  a 
search  policy  that  maximizes  the  probability  of  detecting  the 
object  at  the  end  of  the  mission.  The  two-stage  procedure 
was  first  proposed  by  Posner  [8]  where  he  considered  using 
a  radar  to  locate  a  satellite  in  the  sky  containing  n  cells.  The 
optimality of greedy search, which sequentially looks at the cell
with the largest log-likelihood ratio, was proved in
[11] and extended to the dynamical and multiple hypothesis
testing cases in [3]. Connections to compressed sensing for
acquiring sparse signals under an energy constraint have been
studied in [9] and [1]: [9] proposed an adaptive search policy
for signals having a sparse representation in the search space,
while [1] provided an optimal two-stage procedure to recover
sparse signals using a convex criterion.

The two-stage approach is mainly for a single searcher
looking for a single stationary object. The k-stage approach
is more appropriate for finding multiple objects in sparse
locations using a team of cooperating searchers. The search
for an intelligent object with dynamic mobility requires
studying the search allocation game and dedicating the sensing
resources in a not-too-greedy manner. A realistic mission
may contain multiple objectives with conflicting interests in
variable environments. Thus one needs to integrate various
search policies for different situations into one coherent
performance metric to study the effectiveness of the entire
mission planning process. Such a performance metric has
to be comprehensible and relatively easy to optimize. The
existing search theory does not have an immediate answer
to such a requirement.

The  rest  of  the  paper  is  organized  as  follows.  Section  II 
presents  the  optimal  search  strategy  for  finding  stationary 
objects.  Section  III  discusses  the  robust  search  procedure 
when  the  object  distribution  is  unknown.  Section  IV  extends 
the  existing  search  problem  to  a  search-allocation  game 
where  the  evasive  object  motion  is  modeled  as  a  multi-stage 
search-and-hide game. Simulation examples are provided in
Section V and a concluding summary is given in Section VI.


II.  Optimal  Search  Strategy  for  Stationary 
Objects 


Consider a searcher who starts with the most likely cell
i and does not change to another cell unless $Z_i(t)$ drops by
$\delta/n$ for some small $\delta > 0$. When this decrease of the
log-likelihood ratio occurs, the searcher switches to the most
likely cell j (which used to be the second best cell to search).
If there is only one object hiding in one of the n cells and the
maximum allowable error probability is $\varepsilon$, then the searcher
will decide that cell i contains the object once
$Z_i(t) > \tau(\varepsilon)$, where the threshold is chosen by


$$\tau(\varepsilon) = \log \frac{1 - \varepsilon}{\varepsilon}. \qquad (3)$$


As $\delta \to 0$, the above search procedure becomes optimal in
the sense of minimizing the expected search time to reach a
decision with error probability no larger than $\varepsilon$ [11]. Assume
that the prior probability for cell i to contain an object is $1/n$
for $i = 1, \ldots, n$. If the searcher applies the optimal policy with
the switching frequency among different cells allowed to be
arbitrarily high, then the expected search time under the
optimal procedure is


T{£) 


la2 


(«  -  2) 


(l-2e)log 


n  —  1  —  ne\ 
n  —  1  ) 


(4) 


and  we  can  see  that  the  expected  search  time  scales  like 
O  ^^-logjj  in  the  asymptotic  regime. 

A  practical  procedure  to  approximate  the  optimal  search 
policy  can  be  described  as  a  two-stage  approach  [8]: 

•  Stage 1: Search each cell for a small fixed time $t_1(\varepsilon)$
to update the posterior probability for each cell.

•  Stage 2: Search the cells in the order of decreasing
posterior probabilities for time $t_2(\varepsilon)$ and declare the
finding of the object whenever the log-likelihood ratio
exceeds $\tau(\varepsilon)$.

The two-stage approach requires only twice the expected
search time of the optimal policy in the asymptotic regime
and has the same scaling law independent of n.

In a discrete-time setting, one assumes that the searcher
has to spend at least T seconds in any cell and then decides
whether to look at this cell for another T seconds or search a
different cell. In this case, the searcher will always choose the
cell with the largest log-likelihood ratio at the decision time.
The searcher will declare the finding of the object whenever
the log-likelihood ratio exceeds $\tau(\varepsilon)$. If at the end of a
T-second search on a cell the searcher has to quantize its
belief on whether the cell contains an object or not, then
the problem becomes a sequential decision with quantized
input: instead of the actual log-likelihood ratio, only the two
quantized values 0 and 1 are allowed. The optimal search
policy remains a greedy one that focuses on the cell
with the highest cumulative score at each step [7]. The
quantization rule at each step is assumed to be fixed, which
leads to the false alarm probability $\alpha$ and miss probability
$\beta$. Since $p_i(0)$ is usually fairly small for any cell i, the most
informative quantization rule should operate at the condition

$$\frac{\alpha}{1 - \beta} = \frac{p_i(0)}{1 - p_i(0)} \qquad (5)$$

for any cell i.
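The greedy discrete-time rule described above can be sketched as a small simulation. This is an illustration under stated assumptions rather than the authors' implementation: the prior is uniform, exactly one object is present (so posteriors are normalized across cells), each look lasts `dwell` time units, and the stopping threshold is taken to be $\log((1-\varepsilon)/\varepsilon)$ on the log-odds of the best cell. The function name `greedy_search` and all parameter values are hypothetical.

```python
import math
import random

def greedy_search(n, true_cell, mu, sigma, dwell, eps, rng, max_looks=10000):
    """Greedy search for one object among n cells; returns (declared_cell, looks)."""
    tau = math.log((1.0 - eps) / eps)      # assumed threshold on the log-odds
    loglik = [0.0] * n                     # accumulated log-likelihood ratio per cell
    best = 0
    for looks in range(max_looks):
        # Posterior of each cell under the single-object, uniform-prior assumption.
        peak = max(loglik)
        weights = [math.exp(v - peak) for v in loglik]
        total = sum(weights)
        best = max(range(n), key=lambda i: weights[i])
        p_best = weights[best] / total
        z_best = math.log(p_best / max(1.0 - p_best, 1e-300))
        if z_best >= tau:
            return best, looks
        # Look at the most likely cell for one dwell period.
        drift = mu * dwell if best == true_cell else 0.0
        dy = rng.gauss(drift, sigma * math.sqrt(dwell))
        # Increment of the log-likelihood ratio, cf. (2) applied to one dwell.
        loglik[best] += mu * (2.0 * dy - mu * dwell) / (2.0 * sigma ** 2)
    return best, max_looks

rng = random.Random(0)
cell, looks = greedy_search(n=5, true_cell=3, mu=1.0, sigma=1.0,
                            dwell=1.0, eps=0.05, rng=rng)
```

Because the rule stops only when the posterior of the declared cell reaches $1 - \varepsilon$, the fraction of wrong declarations over many runs with a randomly placed object should stay near or below $\varepsilon$.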

Next, we consider another asymptotic regime where the
number of cells n becomes very large and the number
of objects hiding among the n cells increases sublinearly
according to $n^{1-c}$, where $c \in (0, 1)$ is a constant scaling factor.
If the searcher distributes its effort equally among the n cells,
we will have the following simplified observation model

$$Y_i \sim \mathcal{N}(\mu_i, 1), \quad i = 1, \ldots, n$$

where $\mu_i = 0$ if cell i does not contain any object while
$\mu_i = \mu > 0$ if cell i contains an object. Note that $\mu$ can be
interpreted as the normalized signal-to-noise ratio (SNR). We
assume that $\mu$ scales like $O(\sqrt{r \log n})$, where r depends on
the scaling factor c, which governs how the number of objects
grows as the number of cells increases. For a search procedure to
declare as many objects as possible while keeping the number of
false discoveries growing at a slower rate, we need
two performance metrics to characterize the desired asymptotic
property. Define the false discovery proportion (FDP)
to be the number of incorrectly declared objects relative to
the total number of object declarations. The non-discovery
proportion (NDP) is defined as the number of objects missed
by the searcher relative to the total number of no-object
declarations. A searcher is said to be asymptotically efficient
if both FDP and NDP approach zero as $n \to \infty$. Intuitively,
the normalized SNR has to be high enough for the searcher to
design an efficient search policy. In fact, if r < c, no searcher
can be made asymptotically efficient. On the other hand, if
r > c, then a searcher using a coordinate-wise thresholding rule
to declare the object on each cell is asymptotically efficient
[5]. The interesting case lies at the boundary r = c where the
design of the optimal search policy is highly related to sparse
signal recovery and compressed sensing [4].
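The two metrics can be made concrete with a short Python sketch. The helper names `fdp_ndp` and `threshold_declare`, the threshold level, and the sample observations are assumptions chosen only for illustration.

```python
def fdp_ndp(declared, truth):
    """Return (FDP, NDP) for boolean lists of declarations and ground truth.

    FDP: incorrectly declared objects / total object declarations.
    NDP: missed objects / total no-object declarations.
    """
    n_declared = sum(declared)
    n_not_declared = len(declared) - n_declared
    false_disc = sum(1 for d, s in zip(declared, truth) if d and not s)
    missed = sum(1 for d, s in zip(declared, truth) if (not d) and s)
    fdp = false_disc / n_declared if n_declared else 0.0
    ndp = missed / n_not_declared if n_not_declared else 0.0
    return fdp, ndp

def threshold_declare(y, level):
    """Coordinate-wise thresholding: declare an object where y exceeds level."""
    return [v > level for v in y]

# Assumed toy data: five cells, objects in cells 0 and 2.
truth = [True, False, True, False, False]
y = [3.1, 2.4, 0.7, -0.5, 0.9]
declared = threshold_declare(y, level=2.0)   # cells 0 and 1 are declared
fdp, ndp = fdp_ndp(declared, truth)          # one false discovery, one miss
```

Here one of the two declarations is wrong and one of the three non-declared cells hides an object, so FDP is 1/2 and NDP is 1/3.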

Consider a normalized observation model for cell i that
allows multiple looks

$$Y_i^{(j)} = \sqrt{\phi_i^{(j)}}\, \mu_i + W_i^{(j)}, \quad i = 1, \ldots, n, \; j = 1, \ldots, k \qquad (6)$$

where $W_i^{(j)} \sim \mathcal{N}(0, 1)$ is the additive white Gaussian noise
and $\phi_i^{(j)}$ is related to the signal-to-noise ratio that has been
dedicated in the j-th sensing action to cell i. Without loss
of generality, we impose the total energy constraint for the
whole search effort given by

$$\sum_{i,j} \phi_i^{(j)} \le E. \qquad (7)$$

Note that setting

$$\phi_i^{(j)} = \frac{E}{nk}, \quad i = 1, \ldots, n, \; j = 1, \ldots, k$$

is equivalent to a single look for each cell with $\phi_i = E/n$ owing
to the independence of the noises in the multiple looks and
the total energy constraint.

We consider a sequential search procedure that takes
advantage of the multiple looks in the spirit of the two-stage
method. We apply a portion of the energy to crudely
search all cells; eliminate a fraction of the cells that appear
least promising from further consideration; and iterate this
procedure several times, at each step searching only those
cells retained from the previous step. The algorithm runs in
the following manner.

•  Input: Number of total stages k and energy budget $E^{(j)}$
for stage j such that $\sum_{j=1}^{k} E^{(j)} \le E$.

•  Initialization: Index set of cells to be searched
$I^{(1)} = \{1, 2, \ldots, n\}$.

•  Adaptive Sensing: At stage j, search cell i with equal
effort if $i \in I^{(j)}$ and obtain the measurement $Y_i^{(j)}$.
Update the index set by $i \in I^{(j+1)}$ if $Y_i^{(j)} > 0$ for all the
cells being searched.

•  Output: The final index set $I^{(k)}$, which is very likely to
contain most of the objects.
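The steps above can be sketched in Python as follows. This is a hedged illustration, not the authors' code: the function name `k_stage_search`, the equal split of each stage's energy over the retained cells, the choice to return the set left after the last sign test, and all numbers in the example are assumptions.

```python
import math
import random

def k_stage_search(mu_cells, stage_energy, rng):
    """Run the k-stage procedure and return the retained cell indices.

    mu_cells[i] is the true amplitude of cell i (0 if empty);
    stage_energy[j] is the energy budget of stage j.
    """
    active = list(range(len(mu_cells)))          # initial index set: all cells
    for energy in stage_energy:
        if not active:
            break
        phi = energy / len(active)               # equal effort over retained cells
        kept = []
        for i in active:
            # Observation of the form (6): scaled amplitude plus unit Gaussian noise.
            y = math.sqrt(phi) * mu_cells[i] + rng.gauss(0.0, 1.0)
            if y > 0.0:                          # keep only positive measurements
                kept.append(i)
        active = kept
    return active

# Assumed example: 2000 cells, 20 objects of amplitude 4, four equal-energy stages.
rng = random.Random(0)
n = 2000
objects = set(range(20))
mu_cells = [4.0 if i in objects else 0.0 for i in range(n)]
retained = k_stage_search(mu_cells, [1000.0] * 4, rng)
```

Each stage discards roughly half of the empty cells, since a noise-only measurement is positive with probability 1/2, while cells holding an object are retained with high probability. The energy per retained cell therefore grows from stage to stage even under a fixed stage budget.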

In  order  to  retain  the  signal  component  at  each  stage,  we 
need  to  allocate  a  large  portion  of  sensing  energy  to  the 
first  step.  Due  to  the  sparsity  of  the  objects,  most  cells 
will  be  eliminated  in  the  subsequent  search  stages  with 
reduced  energy.  One  possible  energy  allocation  design  is  to 
exponentially  decrease  the  energy  allocated  on  each  cell  from 
one stage to the next. For example, given a design parameter
$d \in (0, 1)$, we have


EU)  = 


7=i, i 

j=k 


(8) 


III.  Robust  Search  in  Discrete  Space  with 
Unknown  Object  Distribution 


Consider  searching  an  object  in  one  of  n  discrete  cells 
where  the  probability  of  finding  the  object  in  cell  i  within 
the  search  time  t  given  that  the  object  is  in  cell  i  is  denoted  by 
b(i,t).  Note  that  b(i,t)  is  often  called  the  detection  function 
for  cell  i  and  satisfies 


d2b(i,t)  n  db(i,t)  db(i,t) 

dt 2  ’  dt  [=q  ’  dt 


(10) 


A popularly used non-detection function $q_i(t) = 1 - b(i,t)$ is
exponential

$$q_i(t) = e^{-\eta_i t}, \quad i = 1, \ldots, n \qquad (11)$$

where $\eta_i$ is an indicator factor measuring how effective a
unit resource is in the i-th cell for detecting the object. If the
total search time is T, then we are interested in how much
effort in terms of search time $t_i$ should be allocated to cell i
in order to maximize the overall object detection probability.
Of course this depends on the prior probability $p_i$ that the
object is in cell i. Given any probability distribution $\{p_i\}_{i=1}^{n}$,
the optimal search strategy can be written as

$$J = \max_{\{t_i\}} \sum_{i=1}^{n} p_i\, b(i, t_i)$$

$$\text{subject to} \quad \sum_{i=1}^{n} t_i \le T$$


and it has a unique solution for the case of $p_i > 0$ given by
log(p,-T7;)  _ 


ti  =  c- 


Vi 


,  i  =  1 ,...,« 


where  c  is  the  normalizing  constant  given  by 

T 


S”=tlog  (Pittdhi 


(13) 


However, when the true distribution $\{p_i\}$ is unknown while
the searcher assumes a different distribution $\{\hat{p}_i\}$ to derive
the optimal search procedure, the resulting detection
probability becomes

$$\hat{J} = \sum_{i=1}^{n} p_i \left(1 - e^{-\eta_i \hat{t}_i}\right) \qquad (14)$$


where  the  search  time  for  cell  i  is  given  by 


log(PiV,)/rii 
S'Ljlog  (piTli)/Tli 


(15) 


that satisfies $\sum_{j=1}^{k} E^{(j)} = E$. In this case, when
$r > c/(2-d)^{k-1}$, the k-stage procedure guarantees that FDP and NDP
approach zero with probability one as $n \to \infty$. Note that when
$r \to c$, we need at least

$$k = O\!\left(\frac{\log(\log n)}{\log(2 - d)}\right) \qquad (9)$$

stages to reliably identify the sparse locations of the objects.
The proof follows the ideas presented in [5] and is omitted
due to the page limit.
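To get a feel for the growth rate in (9), the ratio $\log(\log n)/\log(2-d)$ can be evaluated numerically. The sketch below is only indicative: (9) is an order statement, so the constant factor is unknown, and the function name `stage_scale` and the sample values of n and d are assumptions.

```python
import math

def stage_scale(n, d):
    """Evaluate log(log n) / log(2 - d), the scaling term in (9)."""
    return math.log(math.log(n)) / math.log(2.0 - d)

# Assumed example: one million cells and design parameter d = 0.5.
scale = stage_scale(10 ** 6, 0.5)   # roughly 6.5
```

The double logarithm means the required number of stages grows very slowly with n, while a value of d closer to 1 increases it because $\log(2-d)$ shrinks.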


It is clear that $\hat{J} \le J$. Without knowing $\{p_i\}$, one can choose
$\{\hat{p}_i\}$ to minimize $J - \hat{J}$ over all possible distributions $\{p_i\}$. In
the worst case, assuming that $\eta_1 = \eta_2 = \cdots = \eta_n = 1$, we
have


J-J  =  e 


-T/n  ' 


i=  1 


1  jn 


l/n 


n  pi 

<i=  i 


pr 


Up. 

Ki=  1 


I  Pi 


(16) 

The performance gap increases as n increases. This indicates
that the robust solution can be significantly worse than the
optimal solution obtained when $\{p_i\}$ is known. Thus when the evasive
object has a certain level of intelligence to select its $\{p_i\}$,
it will have the incentive to do so in order to minimize
the detection probability of the searcher. We will discuss
an alternative formulation of the search problem via a
search-allocation game in the next section.
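The cost of designing the search with a wrong prior can be illustrated numerically. The sketch below is an assumption-laden example, not the authors' procedure: it uses the classical Lagrangian allocation for the exponential detection function, $t_i = [\log(p_i \eta_i / \lambda)]^{+}/\eta_i$ with $\lambda$ found by bisection so that the search times sum to T, and then evaluates the detection probability as in (14). The function names and the three-cell example are hypothetical.

```python
import math

def allocate(prior, eta, total_time, iters=200):
    """Lagrangian time allocation for exponential detection (assumed form)."""
    def used(lam):
        return sum(max(0.0, math.log(p * e / lam)) / e
                   for p, e in zip(prior, eta) if p > 0)
    lo, hi = 1e-300, max(p * e for p, e in zip(prior, eta))
    for _ in range(iters):
        mid = math.sqrt(lo * hi)          # bisection on a logarithmic scale
        if used(mid) > total_time:
            lo = mid
        else:
            hi = mid
    return [max(0.0, math.log(p * e / hi)) / e if p > 0 else 0.0
            for p, e in zip(prior, eta)]

def detection_probability(true_prior, eta, times):
    """Detection probability sum_i p_i (1 - exp(-eta_i t_i)), cf. (14)."""
    return sum(p * (1.0 - math.exp(-e * t))
               for p, e, t in zip(true_prior, eta, times))

# Assumed example: three cells, unit effectiveness, total search time 3.
true_prior = [0.7, 0.2, 0.1]
assumed_prior = [1.0 / 3.0] * 3           # searcher wrongly assumes uniform
eta = [1.0, 1.0, 1.0]
t_matched = allocate(true_prior, eta, 3.0)
t_mismatched = allocate(assumed_prior, eta, 3.0)
j_matched = detection_probability(true_prior, eta, t_matched)
j_mismatched = detection_probability(true_prior, eta, t_mismatched)
```

In this example the matched design reaches a detection probability of about 0.734, while the uniform design gives $1 - e^{-1} \approx 0.632$, a gap of roughly 0.10.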

IV.  Evasive  Object  Search  via 
Search-Allocation  Game 

When  objects  can  move  among  those  cells  being  searched, 
the  problem  becomes  a  search  allocation  game  where  two 
players,  a  searcher  and  an  evader,  join  the  game.  At  the  initial 
time,  the  searcher  has  a  total  energy  constraint  E.  Using 
this  total  energy,  the  searcher  has  to  allocate  resources  in 
the  search  space  to  detect  the  evader.  The  evader  has  an 
initial energy $e_0$. The evader can move in the search space
under  energy  constraint  as  well  as  some  other  factors  to  be 
described  next.  The  strategies  and  information  sets  of  the 
players,  the  payoff  function  and  the  process  of  the  game  are 
as  follows. 

•  At the beginning of time k, the searcher obtains the
information about the evader's position, say, cell i, and
his residual energy. At the same time, the evader is
informed of the searcher's residual budget.

•  Then the evader makes a decision to move from
cell i to its neighborhood cells $N(i)$ in a probabilistic
manner. Specifically, he will spend $e(i,j)$ to move from
cell i to cell j, assuming that $e(i,j) > 0$ if $i \ne j$ and
$e(i,i) = 0$.

•  The searcher allocates his resources based on his
hypothesized estimate of the cell that the evader moves to.
However, this allocation has to take his residual energy
into account.

•  If the evader is in cell i and the amount x of resource is
allocated there, then the searcher can detect the evader
with probability $1 - q_i(x)$, where the non-detection
function $q_i(x)$ is monotonically decreasing in x. Typically,
we can model the miss probability by

$$q_i(x) = e^{-\eta_i x} \qquad (17)$$

where $\eta_i$ is an indicator of how effective a unit resource
is in the i-th cell for detecting the evader. When the
searcher detects the evader, he receives payoff 1 and
the evader loses the same amount. At this moment, the
game is terminated.

•  Unless the detection occurs at time k, the game will
proceed to the next stage k − 1 until it reaches k = 0.
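A single stage of the game described above can be sketched as follows, using the exponential miss probability (17). The function names, the representation of strategies as plain lists, and the two-cell example are illustrative assumptions. Movement costs and residual budgets are left out of this sketch.

```python
import math
import random

def expected_miss(p, eta, phi):
    """Expected miss probability sum_i p_i * exp(-eta_i * phi_i)."""
    return sum(pi * math.exp(-e * x) for pi, e, x in zip(p, eta, phi))

def play_stage(p, eta, phi, rng):
    """Sample one stage: the evader picks a cell from p, then detection is drawn.

    Returns (cell, detected).
    """
    cell = rng.choices(range(len(p)), weights=p)[0]
    detected = rng.random() < 1.0 - math.exp(-eta[cell] * phi[cell])
    return cell, detected

# Assumed example: two cells, the searcher puts one unit of resource in cell 0.
p = [0.5, 0.5]
eta = [1.0, 2.0]
phi = [1.0, 0.0]
miss = expected_miss(p, eta, phi)          # 0.5 * exp(-1) + 0.5
cell, detected = play_stage(p, eta, phi, random.Random(0))
```

With no resource in cell 1, an evader hiding there is never detected in this stage, which is why the searcher has to spread effort according to the evader's likely choices.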

The  above  formulation  is  clearly  a  multi-stage  zero-sum 
stochastic  game.  A  general  stochastic  game  may  be  played 
forever;  however,  it  terminates  with  probability  one  under  the 
assumption  that  it  has  positive  probability  of  termination  at 
any  stage  and  the  value  of  the  game  is  uniquely  determined 
[10]. 

Let $p_i$ be the probability that the evader chooses cell i
for his hiding location. Let $c_i$ be the budget that it costs to
distribute a unit resource in cell i. Let $\eta_i$ be the effectiveness
of a unit resource for detecting the evader in cell i.
Let $\phi_i$ be the search resources allocated to cell i. Let $q_i$ be
the value of the game in the state that the evader is in cell i and
the searcher allocates $\phi_i$ resource to it, with the criterion of
miss probability. In the one-stage game, we have the following
minimax problem.

$$\min_{\{\phi_i\}} \max_{\{p_i\}} \sum_{i \in \mathcal{N}} p_i\, q_i\, e^{-\eta_i \phi_i} \qquad (18)$$

subject to

$$p_i \ge 0, \quad \sum_i p_i = 1 \qquad (19)$$

$$\phi_i \ge 0, \quad \sum_i c_i \phi_i \le E \qquad (20)$$


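This minimax problem has a unique solution given by the
following water-filling procedure.

$$\phi_i = \frac{1}{\eta_i} \left[ \log \frac{q_i}{\rho} \right]^{+} \qquad (21)$$

$$p_i = \begin{cases} \dfrac{c_i/\eta_i}{\sum_{j:\, \rho < q_j} c_j/\eta_j}, & \rho < q_i \\ 0, & \rho \ge q_i \end{cases} \qquad (22)$$

where $[x]^{+} = \max\{x, 0\}$ and $\rho$ is determined by the following
equation

$$\sum_i \frac{c_i}{\eta_i} \left[ \log \frac{q_i}{\rho} \right]^{+} = E \qquad (23)$$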

We  can  apply  the  above  result  of  the  single-stage  game 
recursively  to  the  multi-stage  game  and  the  solution  becomes 
an  iterative  water-filling  procedure.  Note  that  the  above  game 
theoretic  formulation  assumes  that  the  evader’s  position  is 
exposed  to  the  searcher  at  every  stage,  which  is  clearly  a 
disadvantage  to  the  evader.  A  more  challenging  problem 
would  be  that  the  evader’s  position  is  only  revealed  to  the 
searcher  at  the  initial  time.  Then  it  becomes  a  standard 
pursuit-evasion  game  where  only  long-term  strategies  for 
both  players  are  meaningful  in  the  analysis  [6],  [10]. 


V.  Simulation  Study 

To demonstrate the effectiveness of the optimal strategy
for the search-allocation game, we implemented the resource
allocation method in an intelligence, surveillance, and
reconnaissance (ISR) scenario where the searcher needs to find a
moving object with prior information that it may hide near
one of the bridges (Fig. 1a). We first divided the search space
into 10 × 10 cells and assigned the relevant parameters based
on the terrain feature and importance of each cell. Since $\eta_i$ is
an indicator of how effective a unit resource is in the i-th cell
for detecting the evader, we assigned the value for each cell
as shown in Fig. 1c. Note that on each cell, value 1 means the
least effective; value 3 indicates the most effective (in this
case, the cell is on the river); and 2 is an average effectiveness
level without complex urban buildings. The value $c_i$ indicates
the searcher's cost for allocating a unit resource on the i-th cell.
It will be 1 if cell i is on the river and 2 when it includes an urban
environment with buildings. The searcher's cost to assign
a unit resource on each cell is shown in Fig. 1d. The game
value of the i-th cell represents its importance. Since the
searcher's top-level objective is to capture two bridges, the
cells near the two bridges have relatively larger values. The value
for each cell is shown in Fig. 1b.

Fig. 1. An illustrative scenario for searching an evasive object. Panels: (c) effectivity of unit resource for each cell; (d) searcher's cost of assigning unit resource for each cell.

To evaluate the performance of the proposed search strategy
(denoted by game search), we compare it with the random
search method where cells are randomly selected. The
probability of selecting each cell is based on its importance.
At the end of each stage, the searcher will either declare
the acquisition of the evader in a particular cell or continue
allocating his energy to cells until running out of the budget.
For a fixed budget E, we performed 1000 Monte Carlo
simulations for each algorithm and estimated the detection
probability where the evasive object applies the optimal strategy
to the minimax game in each stage. The estimated detection
probabilities after playing 20 stages are shown in Fig. 2.
It is expected that the detection probability will increase
as the searcher has more budget to allocate to the search
area. However, we observe some fluctuations in both curves
using game search and random search due to the limited number
of Monte Carlo runs. Nevertheless, game search outperforms
random search in all cases for E ranging from 5 to 50.
Fig. 3 compares the detection probability at the end of each
stage when the searcher has the budget E = 5. The miss
probability using game search is 54.1%. This means that
there is a 54.1% chance that the searcher cannot find the object
after using all of the budget. Note that the miss probability
is 65% using random search under the same condition. We
can also see that the probability of detecting the object in
early stages using game search is usually much higher than
that using random search. For different budget constraints,
we summarized the comparative results for the first five
stages in Table I. It is clear that game search outperforms
random search in the following two aspects: 1) it yields a larger
cumulative detection probability over the first five stages
(and a larger per-stage probability in most of them);
and 2) it also has a larger detection probability for extended
stages so that the overall miss probability is smaller than that
using random search. This is mainly due to the intelligent
behavior of the evasive object: it has a tendency to hide in
a cell where the searcher needs to allocate more resource
in order to make a detection. This confirms the theoretical
analysis that the searcher's expected game value cannot
increase in any stage by deviating from the optimal search
strategy derived from the search-allocation game.
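The size of the fluctuations attributed to the finite number of Monte Carlo runs can be gauged with the binomial standard error of an estimated probability. This is a back-of-the-envelope sketch that assumes independent runs; the function name is hypothetical, and the inputs are the detection probabilities implied by the miss probabilities quoted above for E = 5.

```python
import math

def binomial_standard_error(p_hat, runs):
    """Standard error of a probability estimated from independent runs."""
    return math.sqrt(p_hat * (1.0 - p_hat) / runs)

runs = 1000
p_game = 1.0 - 0.541      # detection probability of game search at E = 5
p_random = 1.0 - 0.65     # detection probability of random search at E = 5
se_game = binomial_standard_error(p_game, runs)
se_random = binomial_standard_error(p_random, runs)
# Standard error of the difference between the two independent estimates.
se_diff = math.sqrt(se_game ** 2 + se_random ** 2)
gap_in_se = (p_game - p_random) / se_diff
```

With 1000 runs each estimate carries a standard error of about 1.5 to 1.6 percentage points, so point-to-point wiggles of that size in the curves are expected, while the gap between the two methods at E = 5 is about five standard errors.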

VI.  Discussion  and  Conclusions 

We considered the search problem in a sequential decision
setting where the searcher has to determine in which cell to
perform the sensing action at any given time based on the
measurements accumulated so far. For a single stationary
object located in one of the n cells, the two-stage approach
has close-to-optimal performance in terms of the minimum
expected time to declare the object location with a given
error rate. For acquiring a few objects in sparse locations,
the k-stage procedure ensures that the false discovery proportion
(FDP) and non-discovery proportion (NDP) approach zero
with probability one in the asymptotic regime, which meets
the best scaling law of object sparsity. When the object
can hide from cell to cell during different stages of the
search procedure, the search allocation game in the
two-player zero-sum complete-information setting has a unique
minimax solution corresponding to an iterative water-filling
procedure that allocates the sensing effort to those cells
with relatively larger detection probabilities. A simulated ISR
example demonstrated the effectiveness of using the minimax
solution to the search-allocation game for acquiring the
evasive object. The minimax solution significantly outperforms
the random search method in terms of the probability of
detecting the evasive object in the repeated game.

TABLE I
Comparisons of Detection Probabilities

Search   Search      Detection Probabilities
Budget   Algorithm   Stage 1  Stage 2  Stage 3  Stage 4  Stage 5  First 5  Total
E = 5    Random       6.5%     6.4%     4.7%     3.0%     1.7%    22.3%    34.4%
         Game        10.6%    11.5%     9.0%    10.0%     4.8%    45.9%    49.4%
E = 10   Random       9.5%    10.3%     6.5%     6.3%     3.5%    36.1%    51.6%
         Game        12.3%    14.3%     9.5%     9.3%     5.9%    51.3%    62.3%
E = 20   Random      10.7%    10.7%     8.5%     7.3%     5.6%    42.8%    60.3%
         Game        16.1%    15.5%     7.8%     6.6%     5.6%    51.6%    75.9%
o II     Random      11.6%    12.5%    10.5%     7.6%     7.5%    49.7%    75.4%
         Game        IS. 7%   15.6%     9.7%     8.5%    ~S3%     58.8%    83.4%

Fig. 2. Comparison of detection probabilities with different resource
constraints